21 research outputs found

    Exascale Deep Learning for Climate Analytics

    We extract pixel-level masks of extreme weather patterns using variants of Tiramisu and DeepLabv3+ neural networks. We describe improvements to the software frameworks, input pipeline, and the network training algorithms necessary to efficiently scale deep learning on the Piz Daint and Summit systems. The Tiramisu network scales to 5300 P100 GPUs with a sustained throughput of 21.0 PF/s and parallel efficiency of 79.0%. DeepLabv3+ scales up to 27360 V100 GPUs with a sustained throughput of 325.8 PF/s and a parallel efficiency of 90.7% in single precision. By taking advantage of the FP16 Tensor Cores, a half-precision version of the DeepLabv3+ network achieves a peak and sustained throughput of 1.13 EF/s and 999.0 PF/s respectively. Comment: 12 pages, 5 tables, 4 figures, Super Computing Conference November 11-16, 2018, Dallas, TX, US
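    The scaling figures quoted above can be unpacked with a little arithmetic. The sketch below is illustrative only: it assumes parallel efficiency means sustained throughput at N GPUs divided by N times a baseline per-GPU throughput, a definition the abstract does not spell out, and the function names are hypothetical.

```python
# Rough arithmetic on the single-precision figures quoted in the abstract.
# ASSUMPTION: efficiency = sustained / (num_gpus * baseline_per_gpu).
# The abstract does not state how its baseline was measured.

def per_gpu_throughput_tflops(sustained_pflops, num_gpus):
    """Average sustained throughput per GPU, in TF/s (1 PF/s = 1000 TF/s)."""
    return sustained_pflops * 1000.0 / num_gpus


def implied_baseline_tflops(sustained_pflops, num_gpus, efficiency):
    """Per-GPU baseline throughput implied by the assumed efficiency definition."""
    return per_gpu_throughput_tflops(sustained_pflops, num_gpus) / efficiency


# (sustained PF/s, GPU count, parallel efficiency) as quoted in the abstract
runs = {
    "Tiramisu, P100": (21.0, 5300, 0.790),
    "DeepLabv3+, V100": (325.8, 27360, 0.907),
}

for name, (pflops, gpus, eff) in runs.items():
    per_gpu = per_gpu_throughput_tflops(pflops, gpus)
    baseline = implied_baseline_tflops(pflops, gpus, eff)
    print(f"{name}: {per_gpu:.2f} TF/s per GPU at scale, "
          f"{baseline:.2f} TF/s implied baseline")
```

    Under that assumed definition, each P100 sustains roughly 4 TF/s at full scale and each V100 roughly 12 TF/s.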

    GPU peer-to-peer techniques applied to a cluster interconnect

    Modern GPUs support special protocols to exchange data directly across the PCI Express bus. While these protocols could be used to reduce GPU data transmission times, essentially by avoiding staging to host memory, they require specific hardware features which are not available on current-generation network adapters. In this paper we describe the architectural modifications required to implement peer-to-peer access to NVIDIA Fermi- and Kepler-class GPUs on an FPGA-based cluster interconnect. We also discuss the current software implementation, which integrates this feature by minimally extending the RDMA programming model, along with some issues that arose when employing it in a higher-level API such as MPI. Finally, we study the current limits of the technique by analyzing the performance improvements on low-level benchmarks and on two GPU-accelerated applications, showing when and how they seem to benefit from the GPU peer-to-peer method. Comment: paper accepted to CASS 201

    GPU-optimized approaches to molecular docking-based virtual screening in drug discovery: A comparative analysis

    Finding a novel drug is a very long and complex procedure. Using computer simulations, it is possible to accelerate the preliminary phases by performing a virtual screening that filters a large set of drug candidates down to a manageable number. This paper presents the implementation and comparative analysis of two GPU-optimized versions of a virtual screening algorithm targeting novel GPU architectures. This work focuses on the analysis of parallel computation patterns and their mapping onto the target architecture. The first method adopts a traditional approach that spreads the computation for a single molecule across the entire GPU. The second uses a novel batched approach that exploits the parallel architecture of the GPU to evaluate more molecules in parallel. Experimental results showed different behavior depending on the size of the database to be screened: one approach reaches a performance plateau sooner, while the other needs a longer initial transient period but achieves a higher throughput (up to 5x), making it more suitable for extreme-scale virtual screening campaigns.

    circPVT1 and PVT1/AKT3 show a role in cell proliferation, apoptosis, and tumor subtype-definition in small cell lung cancer

    Small cell lung cancer (SCLC) is treated as a homogeneous disease, although the expression of NEUROD1, ASCL1, POU2F3, and YAP1 identifies distinct molecular subtypes. The MYC oncogene, amplified in SCLC, was recently shown to act as a lineage-specific factor to associate subtypes with histological classes. Indeed, MYC-driven SCLCs show a distinct metabolic profile and drug sensitivity. To disentangle their molecular features, we focused on the co-amplified PVT1, frequently overexpressed and originating circular (circRNA) and chimeric RNAs. We analyzed hsa_circ_0001821 (circPVT1) and PVT1/AKT3 (chimPVT1) as examples of such transcripts, respectively, to unveil their tumorigenic contribution to SCLC. In detail, circPVT1 activated a pro-proliferative and anti-apoptotic program when over-expressed in lung cells, and knockdown of chimPVT1 induced a decrease in cell growth and an increase of apoptosis in SCLC in vitro. Moreover, the investigated PVT1 transcripts underlined a functional connection between MYC and YAP1/POU2F3, suggesting that they contribute to the transcriptional landscape associated with MYC amplification. In conclusion, we have uncovered a functional role of circular and chimeric PVT1 transcripts in SCLC; these entities may prove useful as novel biomarkers in MYC-amplified tumors.

    Border and Identity Negotiations: An Analysis of 19th Century Migrations in the U.S

    Through archival research in libraries in the U.S. and Mexico, Lim reveals the hidden history of racial categories and journeys that have been largely erased from both countries' national consciousness, tracing the border from its racial openness in the 1880s to racial differentiation by the 1930s. Through laws, policies, and enactments, both states enforced national and racial uniformity, which in turn limited migrants' ability to cross these boundaries. Yet the immigrants employed a range of survival techniques: they used the state's political instruments, took up different identities, and maneuvered their claims to citizenship and belonging as their situations allowed. Lim's historical account prompts reflection on the challenging history of multiracial migration and on the influence of opportunity structures in limiting or expanding migrants' mobility.

    Colloquium: Large scale simulations on GPU clusters

    Graphics processing units (GPUs) are currently used as a cost-effective platform for computer simulations and big-data processing. Large-scale applications require that multiple GPUs work together, but the efficiency obtained with clusters of GPUs is, at times, sub-optimal because the GPU features are not fully exploited. We describe how excellent efficiency can be achieved for applications in statistical mechanics, particle dynamics, and network analysis by using suitable memory access patterns and mechanisms such as CUDA streams and profiling tools. Similar concepts and techniques may also be applied to other problems, such as the solution of partial differential equations.